Financial Numeric Extreme Labelling: A Dataset and Benchmarking for XBRL Tagging
The U.S. Securities and Exchange Commission (SEC) mandates all public
companies to file periodic financial statements that should contain numerals
annotated with a particular label from a taxonomy. In this paper, we formulate
the task of automating the assignment of a label to a particular numeral span
in a sentence from an extremely large label set. Towards this task, we release
a dataset, Financial Numeric Extreme Labelling (FNXL), annotated with 2,794
labels. We benchmark the performance of the FNXL dataset by formulating the
task as (a) a sequence labelling problem and (b) a pipeline with span
extraction followed by Extreme Classification. Although the two approaches
perform comparably, the pipeline solution provides a slight edge for the least
frequent labels. Comment: Accepted to ACL'23 Findings Paper
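The two formulations named in the abstract can be illustrated with a toy sketch. The sentence, the regular expression, the helper names, and the label `Revenues` below are illustrative assumptions, not material from the FNXL dataset; the classification stage of the pipeline is left out entirely.

```python
import re

# Assumed, simplified notion of a numeral token (digits, commas, optional decimals).
NUMERAL = re.compile(r"^\d[\d,]*(\.\d+)?$")

def extract_numeral_spans(tokens):
    """Pipeline view, stage 1: indices of tokens that look like numerals."""
    return [i for i, tok in enumerate(tokens) if NUMERAL.match(tok)]

def to_sequence_labels(tokens, span_labels):
    """Sequence-labelling view: one tag per token, 'O' for unlabelled tokens."""
    return [span_labels.get(i, "O") for i in range(len(tokens))]

tokens = "Revenue was $ 12.5 million in 2022".split()

# Stage 1 finds candidate numerals; a classifier (not shown) would then
# choose one label per candidate from the large label set.
spans = extract_numeral_spans(tokens)

# Hypothetical gold annotation: only the monetary amount carries a label.
gold = {3: "Revenues"}
tags = to_sequence_labels(tokens, gold)
```

In the sequence-labelling view a single model emits `tags` directly, whereas in the pipeline view the label decision is made per extracted span.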
FinRED: A Dataset for Relation Extraction in Financial Domain
Relation extraction models trained on a source domain cannot be applied on a
different target domain due to the mismatch between relation sets. In the
current literature, there is no extensive open-source relation extraction
dataset specific to the finance domain. In this paper, we release FinRED, a
relation extraction dataset curated from financial news and earnings call
transcripts containing relations from the finance domain. FinRED has been
created by mapping Wikidata triplets using the distant supervision method. We
manually annotate the test data to ensure proper evaluation. We also experiment
with various state-of-the-art relation extraction models on this dataset to
create the benchmark. We see a significant drop in their performance on FinRED
compared to the general relation extraction datasets, which indicates that
better models are needed for financial relation extraction. Comment: Accepted at FinWeb at WWW'2
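A minimal sketch of distant supervision in general, assuming simple case-insensitive string matching of entity names; the function name, the sentences, and the single triplet are toy assumptions and do not reproduce how FinRED was actually built.

```python
def distant_supervision(sentences, triplets):
    """Label a sentence with a relation whenever both entities of a
    knowledge-base triplet are mentioned in it."""
    examples = []
    for sent in sentences:
        lowered = sent.lower()
        for head, relation, tail in triplets:
            if head.lower() in lowered and tail.lower() in lowered:
                examples.append((sent, head, relation, tail))
    return examples

# Toy knowledge-base triplet in (head, relation, tail) form.
triplets = [("Apple", "chief executive officer", "Tim Cook")]

# Toy sentences, invented for illustration.
sentences = [
    "Tim Cook said Apple expects strong demand next quarter.",
    "Apple shares rose on Monday.",
]

examples = distant_supervision(sentences, triplets)
```

Matching on co-occurrence alone can attach a relation to a sentence that mentions both entities without expressing that relation, which is one reason a manually annotated test set is useful for evaluation.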
TRACCS: Trajectory-Aware Coordinated Urban Crowd-Sourcing
We investigate the problem of large-scale mobile crowd-tasking, where a large pool of citizen crowd-workers is used to perform a variety of location-specific urban logistics tasks. Current approaches to such mobile crowd-tasking are very decentralized: a crowd-tasking platform usually provides each worker a set of available tasks close to the worker’s current location; each worker then independently chooses which tasks she wants to accept and perform. In contrast, we propose TRACCS, a more coordinated task assignment approach, where the crowd-tasking platform assigns a sequence of tasks to each worker, taking into account their expected location trajectory over a wider time horizon, as opposed to just instantaneous location. We formulate such task assignment as an optimization problem that seeks to maximize the total payoff from all assigned tasks, subject to a maximum bound on the detour (from the expected path) that a worker will experience to complete her assigned tasks. We develop credible computationally-efficient heuristics to address this optimization problem (whose exact solution requires solving a complex integer linear program), and show, via simulations with realistic topologies and commuting patterns, that a specific heuristic (called Greedy-ILS) increases the fraction of assigned tasks by more than 20%, and reduces the average detour overhead by more than 60%, compared to the current decentralized approach.
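The detour-bounded assignment problem can be sketched with a plain greedy rule. This is not the paper's Greedy-ILS heuristic, whose details the abstract does not give; the straight-line trajectories, Euclidean distances, function names, and toy coordinates are all assumptions made for illustration.

```python
from math import dist

def route_length(points):
    """Total Euclidean length of a route through the given points."""
    return sum(dist(a, b) for a, b in zip(points, points[1:]))

def best_insertion(route, task_loc):
    """Cheapest route obtained by inserting task_loc between two consecutive stops."""
    candidates = [route[:i] + [task_loc] + route[i:] for i in range(1, len(route))]
    return min(candidates, key=route_length)

def greedy_assign(workers, tasks, max_detour):
    """workers: {name: (origin, destination)}; tasks: {name: (location, payoff)}.
    Tasks are considered in decreasing payoff order; each goes to the worker
    whose total detour stays smallest, provided it respects max_detour."""
    routes = {w: [o, d] for w, (o, d) in workers.items()}
    direct = {w: dist(o, d) for w, (o, d) in workers.items()}
    assignment = {w: [] for w in workers}
    for t, (loc, _payoff) in sorted(tasks.items(), key=lambda kv: -kv[1][1]):
        best = None
        for w in workers:
            new_route = best_insertion(routes[w], loc)
            detour = route_length(new_route) - direct[w]
            if detour <= max_detour and (best is None or detour < best[0]):
                best = (detour, w, new_route)
        if best is not None:
            _, w, new_route = best
            routes[w] = new_route
            assignment[w].append(t)
    return assignment

# Toy instance: two commuters on parallel paths, three tasks.
workers = {"w1": ((0, 0), (10, 0)), "w2": ((0, 5), (10, 5))}
tasks = {"a": ((5, 1), 3.0), "b": ((5, 4), 2.0), "c": ((5, 20), 9.0)}
assignment = greedy_assign(workers, tasks, max_detour=1.0)
```

In this toy instance the highest-payoff task `c` stays unassigned because it would violate the detour bound for both workers, which shows how the constraint trades payoff against worker inconvenience.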
TASKer: Behavioral insights via campus-based experimental mobile crowd-sourcing
National Research Foundation (NRF) Singapore under International Research Centres in Singapore Funding Initiative